Patagonian sheepdog: Genomic analyses trace the footprints of extinct UK herding dogs to South America

Most modern dog breeds were developed within the last two hundred years, following strong and recent human selection based predominantly on aesthetics, with few modern breeds constructed solely to maximize their work potential. In many cases, these working breeds represent the last remnants of now lost populations. The Patagonian sheepdog (PGOD), a rare herding breed, is a remarkable example of such a population. Maintained as an isolated population for over 130 years, the PGOD offers a unique opportunity to understand the genetic relationship amongst modern herding breeds, determine key genomic structure of the founder PGOD populations, and investigate how canine genomic data can mirror human migration patterns. We thus analyzed the population structure of 159 PGOD, comparing them with 1514 dogs representing 175 established breeds. Using 150,069 SNPs from a high-density SNP genotyping array, we establish the genomic composition, ancestry, and genetic diversity of the population, complementing genomic data with the PGOD’s migratory history to South America. Our phylogenetic analysis reveals that PGODs are most closely related to modern herding breeds hailing from the United Kingdom. Admixture models illustrate a greater degree of diversity and genetic heterogeneity within the very small PGOD population than in Western European herding breeds, suggesting the PGOD predates the 200-year-old construction of most pure breeds known today. We thus propose that PGODs originated from the foundational herding dogs of the UK, prior to the Victorian explosion of breeds, and that they are the closest link to a now-extinct population of herding dogs from which modern herding breeds descended.


Introduction
Modern dog breeds result from human selection for traits reflecting both aesthetic values and the behavioral needs of human populations [1]. While most modern breeds were developed in Western Europe during the Victorian age by fanciers [2], many working breeds were developed using a deliberate two-step process; initial selection for functional traits designed to accomplish specific tasks critical to human survival, such as herding, hunting, and protection, followed by more recent prioritization of nuanced morphological attributes [3][4][5]. Restrictive geography and the specific needs of particular human enterprises, such as the livestock or sheep industry, have heavily influenced genetic variation within and among herding dog populations [6]. Livestock dogs, with herding and guarding aptitudes, are a particularly interesting example as they are required to fill multiple roles: protection, guarding, and guiding agricultural populations [7].
We and others have hypothesized that migration and ancestry of canines mirrors the history of human populations and their movements [6,8]. The Patagonian sheepdog (PGOD), also called the "Barbucho" or "Ovejero Magallánico", is a working dog found in the Patagonian region of Chile and Argentina [9]. While recognized locally as a distinct, purpose-bred dog variety, the PGOD is not recognized by any formal dog breed registry, as is often the case with landrace breeds worldwide [6,10]. Historic documents indicate that the PGOD descended from working dogs brought by Scottish settlers who immigrated to Chilean Patagonia to develop sheep farming in the region, likely between 1877 and 1910 ( Fig 1A) [9,[11][12][13]. As official standards for individual collie-type breeds were not yet defined, dogs used in sheep farming in Great Britain at this time were simply known as "working collies" or "shepherd dogs" that had been adapted to terrain and climate ( Fig 1B) [9,14]. The geographical isolation of Patagonia, strong behavioral selection over the past 130 years, and little to no introduction of new breeding stock has resulted in a distinct population, uniquely adapted to a harsh environment and the needs of the Patagonian people [9,15,16] (Fig 1C and 1D).
In this study we performed phylogenetic and haplotype sharing analyses to investigate the relationship between the PGOD and modern herding breeds originating in the United Kingdom (UK). We explored genomic structure and ancestry to determine the relationship of the PGOD with modern dog breeds. Using a measure of shared haplotypes, we estimated the divergence time of the PGOD relative to other UK herding breeds to produce a timeline of Shetland sheepdog (SSHP), old English sheepdog (OES), Pembroke welsh corgi (PEMB), and Australian shepherd (AUSS). Within the 16 member UK rural clade, PGODs are monophyletic with the BORD and the Australian kelpie (KELP) (Fig 2A). In addition, we observed a geographic pattern in the phylogeny such that most PGODs from northern Patagonia (the Aysén region in Chile, including seven dogs from the cross-border region of Chubut, Argentina) are more closely related to the BORD (100% bootstrap value), while PGODs collected in southern Patagonia (Magallanes region in Chile) are more closely related to the KELP (100% bootstrap value) ( Fig 2B).
Interestingly, dogs from these two regions are not only geographically isolated from the rest of Chile (i.e., limited by channels and fjords to the north, the Pacific Ocean to the west, Argentina to the east and the Drake passage to the south) but are also separated by the north and south Patagonian Ice Fields ( Fig 1A). Yet, despite their geographical separation, both PGOD populations are similar in terms of morphology and behavior.

Shared haplotypes
To better understand the relationship between PGODs and the other breeds included in the phylogeny (Fig 2), we analyzed genomic similarity using identity by descent (IBD) methods to identify shared haplotypes among 176 dog breeds assigned to 23 clades (Fig 3). In this analysis, PGODs were divided into three groups based on their geographic sampling location: (i) Chubut, Argentina (AGOD), (ii) Aysén, Chile (YGOD), both located in northern Patagonia, and (iii) Magallanes, Chile (MGOD) in southern Patagonia. All three populations show significant levels of haplotype sharing with all herding breeds of the UK rural clade, except for the Australian cattle dog. Additionally, gene flow from German shepherd dog (GSD), a breed from the New World clade, to YGOD and AGOD populations was observed (Fig 3A and 3B and Neighbor-joining cladogram of genetic distances, based on 100 bootstraps, using published data from 176 breeds, including PGODs, and two wild canids [5]. In the cladogram, individual dogs are clustered by breed if their bootstrap value is 100%, and breeds are condensed into clades, as defined previously [5], with breed-to-breed bootstraps of >90%. Light blue indicates UK rural breeds. Individual PGODs are shown as branches in shades of purple through green indicating the sampling locations. (B) Locations from which PGODs were sampled are indicated with black circles. The geographical area is divided into segments of equal size, with samples assigned to one of eight regions (Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL; http://maps.stamen.com/toner-lite/#6/-49.803/-71.213). Dogs collected in each segment of B are represented using the same color coding as in A. List of the breeds in Fig 2A with their abbreviations can be found in S1 Table. https://doi.org/10.1371/journal.pgen.1010160.g002 S2 Table). Haplotype sharing between the YGOD and Lupo italiano (LUPO) also reaches significance. However, we hypothesize that, rather than a recent admixture event in Patagonia between PGOD and LUPO, these results reflect the previously defined historical relationship between GSD and LUPO [4]. Based on these results we generated a new dataset with 11 breeds, nine from the UK rural clade plus the GSD and the PGOD. This dataset, with 247 dogs in total, is hereafter referred to as the "herding dog subset" (S1 Table).

Population structure
To better understand the genomic composition of the PGOD, we explored the population structure and degree of admixture between the PGOD and the ten breeds in the herding dog subset (Fig 3). We tested values of K ranging from 1 to 15, where K is the assumed number of populations [18]. The analysis showed the lowest cross validation (CV) error (0.57) (S1A Fig) for K = 8, suggesting eight as the most likely number of genetically distinct groups within the dataset. The plots obtained for the expected number of populations, K = 2-15, are shown in Fig 4A, with K = 15 being the number of assumed populations that allowed separation of each registered breed into its own cluster. Each color represents a group of breeds that belong to a previously defined clade [5]. Breeds that represent the UK rural clade are shown in light blue. The horizontal line indicates the level below which 95% of breed pairs from different clades share haplotypes. Breeds with median values above this line are determined to have significant haplotype sharing with the breed being analyzed, in this case, the specific PGOD populations. Breed abbreviations can be found in S1 Table. https://doi.org/10.1371/journal.pgen.1010160.g003 The admixture models illustrate the degree of diversity and variability within the PGOD population, which shows greater evidence of genetic heterogeneity compared to other herding breeds, as reflected by the continued mixed composition through increasing values of K ( Fig  4A). Furthermore, a clear geographic differentiation in genomic structure among the PGODs was observed. Comparing the genomic structure between PGOD populations for K = 8, we observed that the northern populations are comprised of genomic signatures consistent with PGOD, BORD, GSD and OES at levels of 41% (SD = 4%), 32% (SD = 3%), 9% (SD = 3%) and 7% (SD = 2%) respectively, for AGOD; while the levels are 42% (SD = 6%), 31% (SD = 8%), 9% (SD = 3%) and 8% (SD = 3%) respectively, for YGOD. Comparatively, the southern MGOD population is comprised of the same breed signatures but at levels of 72% (SD = 15%), 15% (SD = 8%), 5% (SD = 6%) and 3% (SD = 3%), respectively. These geographically-associated differences suggest that the southern population of PGODs shows overall a greater proportion of the genomic signature specific to PGODs (mean = 72%, SD = 15%) but greater variability in the minor ancestry components. In comparison, the population of northern Patagonia shows lower levels of the initial PGOD component (mean = 41%, SD = 4% for AGOD; and mean = 42%, SD = 6% for YGOD), but a more consistent pattern of component signatures among individuals.
Analysis of the ancestry composition of each individual PGOD, a group of 13 dogs from Tierra del Fuego, an island in southern Patagonia, showed very limited contribution from minor ancestry components (over 95% PGOD ancestry) relative to PGODs collected from the mainland of southern Patagonia (Fig 4A). We speculate that this subset of PGODs may represent a more historically accurate version of the breed, hence, we repeated the ADMIXTURE analysis using only the 13 most homogenous PGODs (S5 Table), hereafter referred to as the "homogenous PGODs" (Fig 4B), and the herding breed subset. We tested values of K ranging from 1 to 12. The lowest CV error (0.60) (S1B Fig) was obtained for K = 6 ( Fig 4B). Interestingly, this new admixture analysis identifies a common genomic signature prevalent in the PGOD, KELP, and BORD populations (92%, 91% and 95%, respectively), suggesting that these breeds all descend from the same ancestral population. This signature is also seen at 73% in BERD and 65% in AUSS ( Fig 4B).
An unrooted neighbor-joining tree was also built to demonstrate the relationship between the herding breed subset and the homogenous PGODs. The homogenous PGODs are monophyletic with the BORD, BERD, and KELP ( Fig 4C). A PCA was generated to examine the relationship of these homogenous PGODs with BORD, BERD, and KELP ( Fig 4D). The first two principal components (PCs) explain 9.9% and 8% of the total genetic variance between these closely related breeds, respectively. The first principal component (PC1) produced a major separation of PGOD from the rest of breeds, while PC2 separated KELP from BORD, BERD, and PGOD.

Effective population size and estimation of migration events
The effective population size (N e ) of each herding breed was estimated through SNP-based linkage disequilibrium (LD) analysis, considering a timeframe of 13 to 150 generations. To better represent the N e , PGODs were separated by region. Seven AGODs were available, which shows an N e of 30 at 13 generations, and 308 individuals at 150 generations. Random subsets of 10 dogs each from YGOD and MGOD were generated to calculate N e statistics (S3 Table). At 13 generations a mean (range) N e of 48 (47)(48)(49) individuals for YGOD and 47 (44-53) for MGOD were observed. While at 150 generations a mean (range) N e of 446 (437-451) individuals for YGOD and 419 (394-456) for MGOD was observed (S3 Table). The next comparable breeds are BORD and AUSS, showing an N e of 312 individuals at 150 generations (S3 Table). The effective population size estimation suggests a larger ancestral pool for PGOD compared to the other herding breeds, which is independent of the region of origin. This original population size is larger than the current population of PGODs, likely reflecting the changing need for sheepdogs in the region.
Potential migration events and gene flow between PGOD and the other ten breeds from the herding breed subset were investigated using the software Treemix v.1.12 [19]. Two maximum likelihood trees of the population without migration events and using the golden jackal as an outgroup were performed for region-specific PGODs and for the homogenous PGOD group.
We then incorporated events sequentially up to 10 migrations. The optimal number of migrations was estimated with the optM R package (S2 Fig). We determined that the most probable migratory events were m = 3 and m = 2, and we plotted the residual matrices for PGOD separated by regions and the homogenous PGOD group (S3 Fig). When analyzing PGOD separately by geographic origin, MGOD, YGOD and AGOD were placed on the same branch as BORD, KELP and BERD, and showed gene flow from GSD to both AGOD and YGOD in two migration events (S3A Fig). Interestingly, when we analyzed the group of homogenous PGODs, the algorithm again placed PGOD in the same branch as BORD, KELP and BERD, but also suggested gene flow from a common ancestor between UK herding dogs and GSD to PGOD in one migration event (S3B Fig). The proposed GSD admixture events were corroborated by D-statistics analysis (S2 Table).

Genome diversity
We estimated inbreeding coefficients in the breeds from the herding dog subset using PLINK v1.9 [20]. PGODs were separated by geographic region. We selected random groups of 10 individuals for YGOD and MGOD for the calculations, and one of these groups was randomly chosen for the graph (see "Materials and Methods", S4 Table). All groups of PGODs demonstrate lower levels of inbreeding when compared to the other herding breeds. The lowest mean inbreeding coefficients were observed as follows: AGOD: -0.02; MGOD: 0.02; and YGOD: 0.03 (S5 Table), while the highest was observed in COLL (0.45) (Fig 5A). Similar values were obtained when analyzing all PGODs in a single group (n = 159), where the inbreeding coefficients ranged from -0.04 to 0.27 with a mean of 0.03 (S6 Table). We also calculated nucleotide diversity as it provides an estimate of polymorphism within populations. Analysis was performed in 500-kb non-overlapping windows [21]. Again, random groups of 10 individuals were constructed for YGOD and MGOD and one group was randomly selected for the graph. Compared to the other herding dogs, the three PGOD populations showed the highest levels of nucleotide diversity (Fig 5B). Altogether, these results suggest that PGOD has the highest genetic diversity compared to the other herding breeds considered in this study.

Estimated date of herding breed divergence
As published previously, we adjusted a linear model of the relationship between the total length of shared haplotypes and the historical date of an admixture or divergence event, using nine pairs of breeds [5]. We estimated the slope and intercept that describe the relationship and used it to estimate the year of genomic divergence between each pair of breeds from the herding dog subset using the 13 homogenous PGODs (S7 Table). The PGOD samples utilized here were obtained in 2019 so the estimated years of divergence are considered relative to this date. The divergence dates of the PGOD with each UK herding breed are~149 years ago, whereas the divergence dates of the non-PGOD breed pairs are calculated to have occurred more recently [22][23][24] (Fig 6).

Discussion
The genomic characterization of rare regional dog populations has become a powerful tool for uncovering the demographic history of such populations, and can also be used to track the movement of human populations in the same regions [1,6,25,26]. The historical circumstances that prompted the migration of humans and their herding dogs from the UK to Patagonia in the mid-1800's, and the resulting genetic isolation of these populations, presents a valuable opportunity to explore the genetic implications of this shared journey.
In this study, we explored the relationship of PGODs with other dog breeds through phylogenetic and haplotype sharing analyses. We combined genome-wide marker data from 159 PGODs with genomic information from 1514 individuals representing 175 dog breeds and two wild canids published previously [4,5]. The phylogeny obtained assigned all 159 PGOD individuals to the previously defined UK rural breed clade [5]. However, they did not cluster as a single population (Fig 2). The relationship of PGOD with the UK rural clade is explained by the European migration history throughout Patagonia, with the initial arrival of three hundred sheep from the Falkland Islands, a UK territory, to the Magallanes region of Chile in 1877 [12,13]. This was followed by co-migration of people and sheep from the Falklands to Magallanes during the 1880s (Fig 1A). The colonization incorporated both techniques and standards used by traditional Anglo-Scottish herders, which are still maintained today in Southern Patagonia [9,13].
The migration of shepherds and their dogs continued towards the Aysén region in the Northern Chilean Patagonia through Argentina [9,27]. This historical migration route (Falklands-Magallanes-Argentina/Aysén) is reflected in the current geographic distribution of PGODs and their phylogeny, where two well-defined subpopulations of PGODs are identified, separated by their location in Magallanes and Aysén/Chubut to southern and northern Patagonia, respectively (Fig 2). The two PGOD subpopulations are separated by natural geographic barriers, the most significant being ice fields (Fig 1A). Despite their geographical separation, PGODs from both subpopulations are strikingly similar in appearance and behavior (Fig 1D), likely demonstrating selection for a common function. Validation of this historical selection can be found in documents related to the recruitment of workers from Scotland, where it was stipulated that a sheepherder had to bring with him one or two sheep dogs. Those with pups, signifying the potential for the dogs to reproduce, would receive preference for recruitment [9,28]. These early historical records describe an early settlement, for which human-animal partnerships were key to the success of pastoral activities.
Although phylogenetic trees can describe the relationship between breeds, they cannot provide a full understanding of breed development and evolution [29]. Therefore, to determine the placement of PGOD within the UK rural clade we analyzed hybridization across clades, looking for shared haplotypes between breeds. Despite the geographic isolation that has  [22][23][24]. While the German shepherd (GSD) is not considered part of the UK herding clade, it shows most recent haplotype sharing with the UK breeds between 1848 and 1855. White dashed lines indicate divergence date ranges calculated from the median length of haplotype sharing for each breed with the remaining UK herding breeds. The PGOD shows breed-to-breed divergence dates of between 1850 and 1870, prior to the genetic separation of most of the UK herding breeds. The COLL and SSHP do not show genetic divergence in this representation due to the limitations of predicting more recent divergence events with this dataset. impacted the PGOD, all three subpopulations share haplotypes with the same nine herding breeds from the UK rural clade (Fig 3), inferring that the exchange of haplotypes occurred before the arrival of PGOD to South America and its separation throughout Patagonia.
The dog breeds of the collie lineage share the same ancestors as working sheepdogs from Great Britain before the explosion of modern breeds [2]. Among breeds likely related to PGOD before and during the period of extensive Patagonian colonization, between 1877-1910, are the Old Welsh Grey and other historic UK herding dogs which have similar aptitudes for herding sheep [9,15]. However, these regional varieties disappeared from the UK at the end of the 19 th century as a result of industrialization and changes in trade and transportation patterns [9,14].
To further disentangle the relationship between PGOD and the UK herding breeds, we examined ancestry patterns using the clustering software Admixture (Fig 4A). The PGOD showed higher genetic heterogeneity compared to other herding breeds. Also, within PGOD, the MGOD subpopulation shows a smaller contribution from other breeds to its ancestry composition, likely due to their greater degree of geographic isolation. Most dogs from Southern Patagonia inhabit the island of Tierra del Fuego, which acts as a geographic barrier, reducing admixture with other breeds. Also, sub-structure within the PGOD provides evidence at the local level that the individuals from southern Patagonia have been distinct from YGOD and AGOD for long enough to acquire distinguishable genetic identities between them. When compared to MGOD, the ancestry composition of the AGOD and YGOD shows a greater contribution of BORD, likely due to recent admixture after BORD's introduction to Patagonia over the last 15 years.
It is possible that PGOD sub-populations would have experienced different levels of genetic drift, whereas the MGOD population seems to have been subjected to greater drift, as shown by the 13 dogs defined as "homogenous PGODs". Although, these dogs show a higher inbreeding coefficient than other PGOD sub-populations, the "homogenous PGODs" shows less inbreeding than the modern herding breeds analyzed here (S5 Table). This suggest that the bottlenecks experienced by modern herding breeds is stronger than those caused by the migration event of PGOD to Patagonia, and the subsequent local breeding and selection scheme.
When we explore the genomic structure using the homogenous PGODs from Magallanes, we observe that admixture analysis is not able to separate the PGOD, KELP, and BORD populations. The similar ancestry pattern exhibited by these breeds likely stems from shared common ancestry rather than from recent admixture ( Fig 4B).
Interestingly, a strong genetic signature from the PGOD/KELP/BORD group to the BERD and AUSS was observed (73% and 65%, respectively) (Fig 4B), providing further evidence of a common genetic signature shared across all herding breeds analyzed here. The close genetic relationship between these herding breeds probably occurred before the middle of the 19th century when the different types of herding dogs were not yet separated into distinct breeds, and breeding by shepherds focused solely on obtaining an intuitive and intelligent dog, independent of its pedigree. This system, with no specific selection for any morphologic traits, gave rise to a great diversity of coat type and color in herding dogs. However, particular lines developed in some regions, establishing themselves as local or niche breeds whose behaviors, ability to learn, and perhaps even morphologic features made them ideal for a particular terrain, weather pattern, or stock type [23].
In 1875 two broad types of sheepdog varieties existed: rough-coated and smooth-coated [30]. Subsequently, publications dated between 1885 and 1890 describe three different British varieties: the Scottish collie or rough-haired sheepdog, the smooth-haired sheepdog, and the old English short-tailed sheepdog [31]. These simple descriptions highlight the absence of formalized breed structure among herding dogs in the mid-1800's in the UK. It is not until the late 1800's that official breed clubs were formed for the select purpose of producing standardized dog breeds. These population splits are consistent with the haplotype sharing data from our IBD analyses, wherein the modern UK herding breeds show breed-to-breed divergence dates between 1863 and 1896. Conversely, the haplotype-derived divergence points of the PGOD from each of the modern herding breeds dates to between 1850 and 1870, overlapping with the known migration timeframe of herding dogs to Patagonia and occurring prior to the formation of the modern UK herding breeds (Fig 6). Indeed, the recorded breed origins of the modern UK herding breeds correspond to dates within 10-30 years of our IBD-derived divergence dates (Table 1). Further, analysis of human genetic admixture involving the native populations of Chilean and Argentinian Patagonia indicate a post-colonization influx of European ancestry, specifically from Great Britain that dates to between 1763 and 1931 [32,33]. This once again highlights the genomic patterns observable due to the parallel migration of humans and their dogs.
Our ADMIXTURE analysis revealed a genomic structure in the modern PGOD that is very similar to the genomic structure of BORD, BERD, and KELP. While these breeds are readily distinguishable through our other genetic analyses, this common genomic structure likely reflects the remnant influence of the old working sheep dogs of Great Britain (Fig 4B). Interestingly, the ADMIXTURE analysis shows a low contribution from GSD to the genomic structure of most PGODs (9% and 5% of ancestry in AGOD-YGOD and MGOD, respectively). This finding is reinforced by the Treemix analysis, where maximum likelihood trees inferred migration events from the GSD to AGOD and YGOD (S3A Fig). A migration event from a common ancestor between the UK herding breeds and GSD to the homogenous PGODs was also identified (S3B Fig). It is not surprising that the GSD, which was previously assigned to a group of breeds classified as the New World clade [5], could have a shared genetic history with the UK herding breeds. Previous studies demonstrate significant haplotype sharing among Italian herding breeds and GSD, with admixture/divergence events between 1859-1867 [4]. This suggests that herding behavior has arisen from multiple geographic and genetic backgrounds [5]. The influence of a pervasive common livestock dog from continental Europe, from which the GSD originated, has left an agrarian signature in many breeds [4], including PGOD, with which it has a divergence date of approximately 1850 (Figs 6 and S3B).
Genetic variability and the structure of a domestic breed depends largely on the breeders' decisions and practices [42]. When we analyzed PGOD's inbreeding coefficient and nucleotide diversity, the three PGOD populations showed a higher level of genetic diversity than other

PLOS GENETICS
breeds from the herding dog subset analyzed here (Fig 5). This agrees with PGOD's higher effective population size, which indicates that low levels of genetic drift have occurred in the PGOD population, allowing it to maintain high genetic diversity and a low level of inbreeding compared to other herding breeds. Indeed, we obtained a mean N e of 308 individuals for AGOD, 446 for YGOD, and 419 for MGOD at 200 to 300 generations before present, which exceeds the suggested minimum population size of 50 to 100 individuals for the establishment and recognition of a breed (http://www.fci.be/en/Standing-Orders-of-the-FCI-40.html). The higher genetic diversity and N e in PGOD may be explained by the fact that although these dogs are strongly selected for success in a pastoral environment; they are not under selection to conform to aesthetic standards [9]. Similar findings were described in endemic dog landrace populations from Italy [43].
To genetically characterize a landrace dog breed from Patagonia, our analyses also identified a lost reservoir of ancestral dogs reflecting the foundational population that ultimately gave rise to the modern UK herding breeds. Genomic analyses, coupled with historical documentation trace the origin of the PGOD to the UK, prior to the explosion of modern breeds in the Victorian era. The PGOD belongs to the clade of dogs that share UK heritage, typified by the modern BORD, BERD, and KELP breeds. We propose that the PGOD is the closest living representative of the common ancestor to the original UK herding breeds, mirroring how the foundational UK sheepdog looked and performed, and displaying skills that are preserved by the PGOD in modern Patagonia.

Ethics statement
The field sampling and study protocols were conducted and approved according to the Universidad Austral de Chile Institutional Animal Ethics Committee (N˚317/2018) and reviewed and approved by ANID-Chile (Agencia Nacional de Investigación y Desarrollo, Chile). The sample collection was authorized by owners with a signed consent, in accordance with National Human Genome Research Institute (NHGRI), Animal Care and Use Committee protocols.

Dog samples and DNA extraction
Whole blood samples were obtained from 159 Patagonian sheepdogs (PGOD). Based on owner knowledge, the sampled PGODs were unrelated to at least three generations within a sampling location. The sampled PGOD come from three provinces of the Southern Chilean regions of Magallanes and Antarctica (Magallanes, Tierra del Fuego, and Ú ltima Esperanza (n = 101)), four provinces of the Northern Aysén region (Coyhaique, Aysén, General Carrera and Capitán Prat (n = 51)), and Chubut province, Argentina (n = 7) (Fig 2B). Blood samples were collected by veterinarians through venipuncture of the cephalic vein (3 to 5 ml) from working dogs that comply with the Patagonian sheepdogs morphometric standards [15,44]. Blood samples were collected in acid citrate dextrose anticoagulant (ACD) tubes. Samples were stored at 4˚C prior DNA extraction, and extraction was performed for all blood samples using standard proteinase K/phenol-chloroform isolation methods. Finally, samples were stripped of identifiers, numerically coded, suspended in TE (10 mM Tris base, 0.01 mM EDTA), aliquoted and stored at -80˚C.

Data collection
A set of 159 PGODs was genotyped using the Illumina CanineHD Whole-Genome Genotyping BeadChip (San Diego, CA, USA), which has 173,662 potentially informative markers. This process was carried out at the National Human Genome Research Institute (NHGRI) of the National Institutes of Health (Bethesda, MD, USA). Genotype calls were conducted in Illumina Genome Studio, specifying a 90% call rate. In addition, genotyped samples were merged with a dataset of 175 breeds generated for a previous studies [4,5]. The final dataset included 150,069 SNPs.

Phylogenetic and genetic distances estimation
PLINK v1.9 [20] was used to calculate genetic distances between 1673 individuals, representing 176 breeds, including the PGOD population and two wild canids. The genetic distance was estimated using the "-distance" and the "1-ibs," "square," and "flat-missing" modifiers [20]. Neighbor-joining phylogeny and consensus tree calculation was built using the PHYLIP software package v3.698 [17] with 100 bootstraps, and golden jackal (GDJK) as an outgroup. The bootstrapped cladogram of the consensus tree was drawn in FigTree v.1.4.4 [45]. Breeds were assigned to clades and light blue was used to represent the UK rural clade, relative to the expected clade structure published previously [5]. The unrooted tree was built with the same method detailed above using the herding dog dataset (247 dogs from 11 breeds), but without the outgroup.

Shared haplotypes
Haplotype sharing was determined by identity-by-descent (IBD) estimations among individuals. This analysis was performed on the 176 dog breeds and two wild canids with Beagle v4.1 [46]. The dataset was analyzed in windows of 1,000 SNPs with an overlap of 25 SNPs. Haplotype sharing was considered significant when median values fell above the 95 th percentile of all across-clade breed pairs. Boxplots of haplotype sharing distributions between breeds were performed using R Core Team [47].

Population structure of patagonian sheepdogs
The genetic structure and the extent of admixture between PGOD and related herding breeds was evaluated through the model-based clustering algorithm implemented in the ADMIX-TURE software v1.3 [18]. To reduce prediction error, ADMIXTURE cross-validation (CV) was used to determine the optimal K-value, minimizing the CV error using the script described in the ADMIXTURE documentation [18]. K represents the number of populations assigned during each clustering run. To assess the population structure of the herding breeds including PGOD, we ran a PCA using the R package flashpcaR [48]. Gene flow was calculated by D-statistics using the R package ADMIXTOOLS [49].

Effective population size and estimation of migration events
To calculate the effective population size of herding dogs we used SNeP v1.1 [50]. PGODs were separated by geographic region. Because of the uneven sample size per region, we randomly selected (without replacement) groups of 10 individuals for YGOD and MGOD by using the sample function from base package in R. All AGOD individuals available were used for the calculations on this group.
The herding dog dataset was separated into unique files by breed and the "ped" and "map" files were created with the parameter "-recode12" in PLINK software v1.9. A maximum likelihood tree was constructed with Treemix v.1.12 [19]. The trees were produced by analyzing the data in windows of 1,000 SNPs using the flags -k 1000 and 1,000 bootstrap repeats using the flag -bootstrap 1000 parameter and allowing for one to ten migration events. To estimate the number of migration edges on a population tree we use the R package optM (https://cran.rproject.org/web/packages/OptM/index.html). We ran five iterations at each migration event as optM requires at least two iterations to be run for each value of the number of migration events (m value).

Intra-and inter-breed genomic diversity
Inbreeding coefficients (F) were calculated from 247 dogs from 11 herding breeds using the "-het" function of PLINK v.1.9 [20]. The maximum, minimum, and mean values were obtained. The breed-specific F value was determined by averaging individual F values for all dogs of a single breed. Nucleotide diversity was estimated in 500-kb non-overlapping windows with the VCFTools software v0.1.15, using the parameters: window_pi. [21]. As explained in the effective population size analysis, PGODs were separated by geographic region and the sample function in R was used for the random selection procedure. These results were graphed using the R package ggplot2 [47,51].

Estimated date of herding breed divergence
To calculate the number of years since shared genetic history observed between the herding breeds analyzed here, we used a linear model between the total length of haplotype sharing and the age of a known admixture or divergence event, occurring between 35 and 160 years before present [5]. We adjusted the model using nine pairs of breeds and applied this equation to the total shared haplotypes calculated from the genotyping data. We estimated the slope and intercept that describe the relationship and used it to estimate the year of genomic divergence between each pair of breeds (linear correlation r 2 = 1). We used the relationship equation y = -8736150x + 1501072917 on our herding data set, including the 13 homogenous PGODs, to determine historical time estimations, where y is the total shared haplotype length and x is the number of years. The PGOD samples utilized here were obtained in 2019 so the estimated years of divergence are considered relative to this date.  Table. Effective population size (N e ) of herding dogs. N e was obtained based on the calculation of linkage disequilibrium (LD) between SNP markers considering a timeframe of 13 to 150 generations ago. PGOD was separated by region of origin in AGOD, YGOD and MGOD. Random subsets of 10 dogs from YGOD and MGOD were generated to calculate N e .  Maximum likelihood trees show the most important migration events. Scale bar shows ten times the average standard error of the sample covariance matrix. The estimated migration between breeds and gene flow are shown according to by direction and weight (yellow to red = 0 to 0.5). (A) Maximum likelihood tree using three migration events within the herding dog subset and PGODs separated by region in AGOD, YGOD, and MGOD. The residual matrix is plotted from a TreeMix analysis under 3 migration events (m = 3). (B) Tree using two migration events within the herding dog subset and the homogenous PGODs. The residual matrix is plotted from a TreeMix analysis under 2 migration events (m = 2). The breed abbreviations correspond to S1 Table. (TIF)